Anyone who has considered donating money to a political campaign may wonder, “what happens to my contribution?” According to OpenSecrets.org, the total amount of money spent by all candidates and parties during the 2012 presidential election amounted to approximately $2.6 billion. This amount is still only about 0.4% of what was spent by the U.S. military in 2015, but that’s another story.

In terms of the 2016 presidential election, the data set that we used, provided by the Federal Election Commission, says that over $420 million has been spent already – and that’s only between April 1 and September 15, 2015, so there’s still over 6 months to go.

So where do all of these dollars go when you support a candidate? We decided to find out!


Part I: Nation-Wide Expenditures

First, we looked at the 100 largest expenditures and determined the candidates whose transactions made up this list. The top 4 were:

Then, we looked at when these 100 expenditures were made and under which spending categories they fall:

require(mdsr)
## Loading required package: mdsr
## Loading required package: mosaic
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Loading required package: lattice
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.2.4
## Loading required package: car
## Loading required package: mosaicData
## 
## Attaching package: 'mosaic'
## The following object is masked from 'package:car':
## 
##     logit
## The following objects are masked from 'package:dplyr':
## 
##     count, do, tally
## The following objects are masked from 'package:stats':
## 
##     binom.test, cor, cov, D, fivenum, IQR, median, prop.test,
##     quantile, sd, t.test, var
## The following objects are masked from 'package:base':
## 
##     max, mean, min, prod, range, sample, sum
require(magrittr)
## Loading required package: magrittr
require(ggplot2)
require(lubridate)
## Loading required package: lubridate
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:mosaic':
## 
##     interval
file1 <-"~/Desktop/zipcode.csv"
file2 <- "~/Desktop/expenditures.csv"
zipcode <-read.csv(file1)
expenditures <- read.csv(file2)
t <- expenditures %>%
  summarize(t = sum(disb_amt))
View(t)

expenditures100 <- expenditures %>%
  arrange(desc(disb_amt))  %>%
  mutate(date = dmy(disb_dt)) %>%
  select(cand_nm, date, disb_amt, recipient_st, disb_desc)

Not surprisingly, media and salary-related spending fall are among the top 10 categories for expenditures. According to a NYT article from August 2015, TV advertisements are not actually that effective at changing voters’ minds; while they reach far more people than any other type of media – around 87 percent of people over 18 – it is a method with diminishing returns that comes down to quantity over quality.


Next, we looked at these 4 candidates expenditures to see where (geographically) their money went between April and September 2015.

  1. Bernard Sanders:
Sanders_expenditures <- expenditures %>%
  filter(cand_nm == "Sanders, Bernard") %>%
  select(disb_amt, recipient_st) %>%
  group_by(recipient_st) %>%
  summarize(single_transaction_amount = sum(disb_amt)) %>%
  rename(state = recipient_st)

LongLatSanders <- Sanders_expenditures %>%
  left_join(zipcode, by = "state")
## Warning in left_join_impl(x, y, by$x, by$y): joining factors with different
## levels, coercing to character vector
ggplot(data = LongLatSanders, aes(x = longitude, y = latitude)) + 
  geom_point(aes(color = single_transaction_amount), size = 1) +
  scale_x_continuous() + 
  scale_y_continuous() +
  geom_text(x = -83, y = 60, label = "Young Voter Hot-Spot: \n over 70,000 students attend a school \n within the MA state university system", fontface = "italic", 
            size = 3) +
  geom_curve(x = -70, xend = -70,
             y = 42, yend = 53, 
             curvature = 0) +
  scale_colour_gradientn(colours = rainbow(7)) +
  theme(plot.title = element_text(size = 15),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  ggtitle("Presidential Candidate National Expenditures: Bernard Sanders")
## Warning: Removed 2 rows containing missing values (geom_point).

  1. Hillary Clinton:
Clinton_expenditures <- expenditures %>%
  filter(cand_nm == "Clinton, Hillary Rodham") %>%
  select(disb_amt, recipient_st) %>%
  group_by(recipient_st) %>%
  summarize(single_transaction_amount = sum(disb_amt)) %>%
  rename(state = recipient_st)

LongLatClinton <- Clinton_expenditures %>%
  left_join(zipcode, by = "state")
## Warning in left_join_impl(x, y, by$x, by$y): joining factors with different
## levels, coercing to character vector
ggplot(data = LongLatClinton, aes(x = longitude, y = latitude)) + 
  geom_point(aes(color = single_transaction_amount), size = 1) +
  scale_x_continuous() + 
  scale_y_continuous() + 
  scale_colour_gradientn(colours = rainbow(7)) +
  theme(plot.title = element_text(size = 15),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  ggtitle("Presidential Candidate National Expenditures: Hillary Clinton")
## Warning: Removed 2 rows containing missing values (geom_point).

  1. Marco Rubio:
Rubio_expenditures <- expenditures %>%
  filter(cand_nm == "Rubio, Marco") %>%
  select(disb_amt, recipient_st) %>%
  group_by(recipient_st) %>%
  summarize(single_transaction_amount = sum(disb_amt)) %>%
  rename(state = recipient_st)

LongLatRubio <- Rubio_expenditures %>%
  left_join(zipcode, by = "state")
## Warning in left_join_impl(x, y, by$x, by$y): joining factors with different
## levels, coercing to character vector
ggplot(data = LongLatRubio, aes(x = longitude, y = latitude)) + 
  geom_point(aes(color = single_transaction_amount), size = 1) +
  scale_x_continuous() + 
  scale_y_continuous() + 
  scale_colour_gradientn(colours = rainbow(7)) +
  theme(plot.title = element_text(size = 15),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  ggtitle("Presidential Candidate National Expenditures: Marco Rubio")
## Warning: Removed 4 rows containing missing values (geom_point).

  1. Donald Trump:
Trump_expenditures <- expenditures %>%
  filter(cand_nm == "Trump, Donald J.") %>%
  select(disb_amt, recipient_st) %>%
  group_by(recipient_st) %>%
  summarize(single_transaction_amount = sum(disb_amt)) %>%
  rename(state = recipient_st)

LongLatTrump <- Trump_expenditures %>%
  left_join(zipcode, by = "state")
## Warning in left_join_impl(x, y, by$x, by$y): joining factors with different
## levels, coercing to character vector
View(LongLatTrump)
ggplot(data = LongLatTrump, aes(x = longitude, y = latitude)) +
  geom_point(aes(color = single_transaction_amount), size = 1) + 
  scale_x_continuous() + 
  scale_y_continuous() + 
  scale_colour_gradientn(colours = rainbow(7)) +
  theme(plot.title = element_text(size = 15),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  ggtitle("Presidential Candidate National Expenditures: Donald Trump")

While there are some notable differences in candidate spending (for example, Trump was the only candidate to spend money on just the contiguous 48 states), there are also similarities:

Looking at a map of delegates per state can help tell us why certain states are in the upper spending range for most candidates:

Clinton and Trump, currently leading their respective parties in the delegate race, both led in spending on California – so is it a simple cause-and-effect relationship? Maybe not; while Sanders and Trump led spending in Virginia, Clinton and Trump took the state primary.


Part II: Massachusetts Contributions

Where are candidates’ funds coming from in the state of Massachusetts?

We decided to look at the two candidates who are currently leading in the primaries for their political party in terms of delegates won, using data that spanned from August 2015 to January 2016.

require(rvest)
## Loading required package: rvest
## Loading required package: xml2
require(ggmap)
## Loading required package: ggmap
## 
## Attaching package: 'ggmap'
## The following object is masked from 'package:magrittr':
## 
##     inset
require(scales)
## Loading required package: scales
## 
## Attaching package: 'scales'
## The following object is masked from 'package:mosaic':
## 
##     rescale
require(RColorBrewer)
## Loading required package: RColorBrewer
file3 <- "~/Desktop/MAcontributions.csv"
PContri <- read.csv(file = file3, header = TRUE)
  1. Hillary Clinton:
RunDemocrat<-PContri %>%
  filter(cand_nm == "Clinton, Hillary Rodham")%>%
  rename(name = cand_nm,
         AmountContr = contb_receipt_amt,
         occupation = contbr_occupation,
         city = contbr_city)%>%
  select(-receipt_desc,-memo_cd,-memo_text,-form_tp,-file_num,-tran_id,-cmte_id)
View(RunDemocrat)
RunDemocrat %>%
  select(name,city,AmountContr)%>%
  group_by(name)%>%
  summarise(N= n(), TotalContr = sum(AmountContr))
## Source: local data frame [1 x 3]
## 
##                      name     N TotalContr
##                    (fctr) (int)      (dbl)
## 1 Clinton, Hillary Rodham  8458    4042909
RunDemocrat %>%
  select(name,city,AmountContr)%>%
  group_by(city)%>%
  summarise(N=n(), highestContr = max(AmountContr))%>%
  dplyr::arrange(desc(N))
## Source: local data frame [340 x 3]
## 
##             city     N highestContr
##           (fctr) (int)        (dbl)
## 1         BOSTON  1170         3500
## 2      CAMBRIDGE   632         2700
## 3      BROOKLINE   314         2700
## 4         NEWTON   206         2700
## 5  JAMAICA PLAIN   188         2700
## 6     SOMERVILLE   185         2700
## 7      ARLINGTON   173         2700
## 8     FRAMINGHAM   152         2700
## 9      LEXINGTON   144         2700
## 10 CHESTNUT HILL   142         2700
## ..           ...   ...          ...
TotalD<-RunDemocrat %>%
       select(city,AmountContr,contb_receipt_dt, contbr_zip, contbr_st) %>%
       rename(ReceiptDt = contb_receipt_dt)%>%
       group_by(city, contbr_st)%>%
       summarize(N=n(),TotalCntr = sum(AmountContr), Average = TotalCntr/sum(N))
head(TotalD)
## Source: local data frame [6 x 5]
## Groups: city [6]
## 
##       city contbr_st     N TotalCntr    Average
##     (fctr)    (fctr) (int)     (dbl)      (dbl)
## 1 ABINGTON        MA     8    901.00  112.62500
## 2    ACTON        MA    34  16581.64  487.69529
## 3    ADAMS        MA     1     16.00   16.00000
## 4   AGAWAM        MA     4    475.00  118.75000
## 5   ALFORD        MA    14    152.00   10.85714
## 6  ALLSTON        MA     2   5400.00 2700.00000
PresiD <- TotalD%>%
  left_join(zipcode, by = "city","contbr_st") %>%
  filter(state == "MA")
## Warning in left_join_impl(x, y, by$x, by$y): joining factors with different
## levels, coercing to character vector
require(ggmap)
maptype = c("osm")
myLocation<-"Massachusetts"
myLocation<-c( lon = -71.38244 , lat = 42.40721)
Size<- seq(from = 10, to = 3000, by = 500)

myMap <-get_map(location = myLocation, source ="osm", maptype = "osm", crop = FALSE)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=42.40721,-71.38244&zoom=10&size=640x640&scale=2&maptype=terrain&sensor=false
plot_HighMonth<- ggmap(myMap)+geom_point(data = PresiD, aes(x = longitude , y = latitude, size = Average),alpha = .5, color = "darkblue")+ scale_size(breaks = seq( from = 10, to = 3000, by = 500))+labs(title = "Hillary Clinton, Average Contributions by City")

plot_HighMonth

  1. Donald Trump
RunRepublicans<- PContri%>%
  filter(cand_nm == "Trump, Donald J.")%>%
  rename(name = cand_nm,
         AmountContr = contb_receipt_amt,
         occupation = contbr_occupation,
         city = contbr_city)%>%
  select(-receipt_desc,-memo_cd,-memo_text,-form_tp,-file_num,-tran_id,-cmte_id)

RunRepublicans %>%
  select(name,city,AmountContr)%>%
  group_by(name)%>%
  summarise(N= n(), TotalContr = sum(AmountContr))
## Source: local data frame [1 x 3]
## 
##               name     N TotalContr
##             (fctr) (int)      (dbl)
## 1 Trump, Donald J.   204   55790.39
TotalR<-RunRepublicans %>%
     select(city,AmountContr,contb_receipt_dt, contbr_st) %>%
     rename(ReceiptDt = contb_receipt_dt)%>%
     group_by(city, contbr_st)%>%
     summarize(N=n(),TotalCntr = sum(AmountContr) , Average = TotalCntr / sum(N))
 
 head(TotalR)
## Source: local data frame [6 x 5]
## Groups: city [6]
## 
##        city contbr_st     N TotalCntr  Average
##      (fctr)    (fctr) (int)     (dbl)    (dbl)
## 1     ACTON        MA     1    300.00  300.000
## 2   ALLSTON        MA     1   2700.00 2700.000
## 3   ANDOVER        MA     5   1500.94  300.188
## 4 ARLINGTON        MA     2    612.89  306.445
## 5 ATTLEBORO        MA     1    500.00  500.000
## 6   BEDFORD        MA     1    100.00  100.000
 PresiR <- TotalR %>%
  left_join(zipcode, by = "city","contbr_st") %>%
  filter(state == "MA")
## Warning in left_join_impl(x, y, by$x, by$y): joining factors with different
## levels, coercing to character vector
plot_TotalR<- ggmap(myMap)+geom_point(data = PresiR, aes(x = longitude , y = latitude, size = Average),alpha = .5, color = "darkred")+ scale_size(breaks = seq( from = 10, to = 3000, by = 500))+labs(title ="Donald Trump, Average Contributions by City")

plot_TotalR

**Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.

In the state of Massachusetts, claimed by Hillary and Trump in the recent primary election, Hillary received significantly more contributions per city than did Trump, with each individual donation amounting to a greater contribution than those to Trump.

While we did not look at other states’ individually in terms of voter contributions to candidates, it is possible that Trump’s supporters are aware of his monetary success and do not see the need to contribute to his campaign.

Sources: